Review for NeurIPS paper: Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Additional Feedback: Minor issues
* Visualization method of Figure 1: I am not sure how the authors produced this figure. Is it based on a PCA of the trajectories? It is also unclear why the trajectories appear as straight lines; training the Taylorized model (2) is just linear regression. More technically, when a data-dependent NTK is used in a linearized model, the positive definiteness of this NTK is non-trivial, and the equivalence to kernel regression becomes unclear.
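For concreteness, here is a minimal JAX sketch (an assumption of mine, not the authors' code) of the kind of Taylorized model the review's equation (2) refers to: a first-order expansion of the network around its initialization w0, so that the model is linear in the weight displacement w - w0. The toy network f is an illustrative stand-in for the actual architecture.

```python
# Hedged sketch of a Taylorized (linearized) model around initialization w0:
#   f_lin(x; w) = f(x; w0) + J(x; w0) @ (w - w0)
# Fitting f_lin by least squares is linear regression in (w - w0).
import jax
import jax.numpy as jnp

def f(w, x):
    # Toy two-layer network standing in for the actual architecture.
    return jnp.tanh(x @ w["W1"]) @ w["W2"]

def f_lin(w, w0, x):
    # First-order Taylor expansion of f around w0, computed via a JVP
    # in the direction of the weight displacement dw = w - w0.
    dw = jax.tree_util.tree_map(jnp.subtract, w, w0)
    y0, jvp_out = jax.jvp(lambda p: f(p, x), (w0,), (dw,))
    return y0 + jvp_out
```

At w = w0 the two models coincide exactly, and f_lin is linear in w, which is the sense in which its training is "just linear regression" in the review's remark.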
Meta-review for NeurIPS paper: Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
The reviews for this paper were overall positive. The paper presents an empirical inquiry into the geometry of the loss landscape and the data-dependent neural tangent kernel. The authors examine the dynamics of the kernel and the loss landscape, comparing the learned kernel to the corresponding neural networks. The reviewers appreciated the evaluation of the so-called 'parent-child spawning' phenomenon and of the approximation accuracy of data-dependent neural tangent kernels. The authors made a laudable effort to question common preconceptions about neural tangent kernels through an extensive set of numerical experiments, leading to interesting empirical observations.
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics is well-approximated by a linear weight expansion of the network at initialization. Standard training, however, diverges from its linearization in ways that are poorly understood. We study the relationship between the training dynamics of nonlinear deep networks, the geometry of the loss landscape, and the time evolution of a data-dependent NTK. We do so through a large-scale phenomenological analysis of training, synthesizing diverse measures characterizing loss landscape geometry and NTK dynamics. In multiple neural architectures and datasets, we find these diverse measures evolve in a highly correlated manner, revealing a universal picture of the deep learning process.
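To make the object of study concrete, the following hedged JAX sketch computes the empirical (data-dependent) NTK at the current parameters w, K(x1, x2) = J(x1; w) J(x2; w)^T; tracking how this matrix changes along training is the "time evolution" named in the title. The scalar-output convention and function names here are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: empirical (data-dependent) NTK at parameters w,
#   K(x1, x2) = J(x1; w) @ J(x2; w).T,
# for a network f(w, x) with scalar output per example, shape (batch,).
import jax
import jax.numpy as jnp

def empirical_ntk(f, w, x1, x2):
    # Per-example Jacobians w.r.t. every parameter leaf, flattened into
    # rows so the kernel becomes a single matrix product.
    def flat_jac(x):
        j = jax.jacobian(lambda p: f(p, x))(w)  # pytree of (n, *param_shape)
        leaves = jax.tree_util.tree_leaves(j)
        return jnp.concatenate([l.reshape(x.shape[0], -1) for l in leaves], axis=1)
    return flat_jac(x1) @ flat_jac(x2).T        # Gram matrix, shape (n1, n2)
```

In the linearized (infinite-width, small learning rate) regime this Gram matrix stays essentially frozen at its value at initialization, which is the kernel-machine picture the abstract contrasts with standard training.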